Cosine Similarity Metric Learning for Face Verification

نویسندگان

  • Hieu V. Nguyen
  • Li Bai
چکیده

Face veri cation is the task of deciding by analyzing face images, whether a person is who he/she claims to be. This is very challenging due to image variations in lighting, pose, facial expression, and age. The task boils down to computing the distance between two face vectors. As such, appropriate distance metrics are essential for face veri cation accuracy. In this paper we propose a new method, named the Cosine Similarity Metric Learning (CSML) for learning a distance metric for facial veri cation. The use of cosine similarity in our method leads to an e ective learning algorithm which can improve the generalization ability of any given metric. Our method is tested on the state-of-the-art dataset, the Labeled Faces in the Wild (LFW), and has achieved the highest accuracy in the literature. Face veri cation has been extensively researched for decades. The reason for its popularity is the non-intrusiveness and wide range of practical applications, such as access control, video surveillance, and telecommunication. The biggest challenge in face veri cation comes from the numerous variations of a face image, due to changes in lighting, pose, facial expression, and age. It is a very di cult problem, especially using images captured in totally uncontrolled environment, for instance, images from surveillance cameras, or from the Web. Over the years, many public face datasets have been created for researchers to advance state of the art and make their methods comparable. This practice has proved to be extremely useful. FERET [1] is the rst popular face dataset freely available to researchers. It was created in 1993 and since then research in face recognition has advanced considerably. Researchers have come very close to fully recognizing all the frontal images in FERET [2,3,4,5,6]. However, these methods are not robust to deal with non-frontal face images. Recently a new face dataset named the Labeled Faces in the Wild (LFW) [7] was created. LFW is a full protocol for evaluating face veri cation algorithms. Unlike FERET, LFW is designed for unconstrained face veri cation. Faces in LFW can vary in all possible ways due to pose, lighting, expression, age, scale, and misalignment (Figure 1). Methods for frontal images cannot cope with these variations and as such many researchers have turned to machine learning to 2 Hieu V. Nguyen and Li Bai Fig. 1. From FERET to LFW develop learning based face veri cation methods [8,9]. One of these approaches is to learn a transformation matrix from the data so that the Euclidean distance can perform better in the new subspace. Learning such a transformation matrix is equivalent to learning a Mahalanobis metric in the original space [10]. Xing et al. [11] used semide nite programming to learn a Mahalanobis distance metric for clustering. Their algorithm aims to minimize the sum of squared distances between similarly labeled inputs, while maintaining a lower bound on the sum of distances between di erently labeled inputs. Goldberger et al. [10] proposed Neighbourhood Component Analysis (NCA), a distance metric learning algorithm especially designed to improve kNN classi cation. The algorithm is to learn a Mahalanobis distance by minimizing the leave-one-out cross validation error of the kNN classi er on a training set. Because it uses softmax activation function to convert distance to probability, the gradient computation step is expensive. Weinberger et al. [12] proposed a method that learns a matrix designed to improve the performance of kNN classi cation. The objective function is composed of two terms. The rst term minimizes the distance between target neighbours. The second term is a hinge-loss that encourages target neighbours to be at least one distance unit closer than points from other classes. It requires information about the class of each sample. As a result, their method is not applicable for the restricted setting in LFW (see section 2.1). Recently, Davis et al. [13] have taken an information theoretic approach to learn a Mahalanobis metric under a wide range of possible constraints and prior knowledge on the Mahalanobis distance. Their method regularizes the learned matrix to make it as close as possible to a known prior matrix. The closeness is measured as a Kullback-Leibler divergence between two Gaussian distributions corresponding to the two matrices. In this paper, we propose a new method named Cosine Similarity Metric Learning (CSML). There are two main contributions. The rst contribution is Cosine Similarity Metric Learning for Face Veri cation 3 that we have shown cosine similarity to be an e ective alternative to Euclidean distance in metric learning problem. The second contribution is that CSML can improve the generalization ability of an existing metric signi cantly in most cases. Our method is di erent from all the above methods in terms of distance measures. All of the other methods use Euclidean distance to measure the dissimilarities between samples in the transformed space whilst our method uses cosine similarity which leads to a simple and e ective metric learning method. The rest of this paper is structured as follows. Section 2 presents CSML method in detail. Section 3 present how CSML can be applied to face veri cation. Experimental results are presented in section 4. Finally, conclusion is given in section 5. 1 Cosine Similarity Metric Learning The general idea is to learn a transformation matrix from training data so that cosine similarity performs well in the transformed subspace. The performance is measured by cross validation error (cve). 1.1 Cosine similarity Cosine similarity (CS) between two vectors x and y is de ned as: CS(x, y) = x y ‖x‖ ‖y‖ Cosine similarity has a special property that makes it suitable for metric learning: the resulting similarity measure is always within the range of −1 and +1. As shown in section 1.3, this property allows the objective function to be simple and e ective. 1.2 Metric learning formulation Let {xi, yi, li}i=1 denote a training set of s labeled samples with pairs of input vectors xi, yi ∈ R and binary class labels li ∈ {1, 0} which indicates whether xi and yi match or not. The goal is to learn a linear transformation A : R → R(d ≤ m), which we will use to compute cosine similarities in the transformed subspace as: CS(x, y,A) = (Ax) (Ay) ‖Ax‖ ‖Ay‖ = xAAy √ xTATAx √ yTATAy Speci cally, we want to learn the linear transformation that minimizes the cross validation error when similarities are measured in this way. We begin by de ning the objective function. 4 Hieu V. Nguyen and Li Bai 1.3 Objective function First, we de ne positive and negative sample index sets Pos and Neg as:

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Quasi Cosine Similarity Metric Learning

It is vital to select an appropriate distance metric for many learning algorithm. Cosine distance is an efficient metric for measuring the similarity of descriptors in classification task. However, the cosine similarity metric learning (CSML)[1] is not widely used due to the complexity of its formulation and time consuming. In this paper, a Quasi Cosine Similarity Metric Learning (QCSML) is pro...

متن کامل

Some topics on similarity metric learning

The success of many computer vision problems and machine learning algorithms critically depends on the quality of the chosen distance metrics or similarity functions. Due to the fact that the real-data at hand is inherently taskand data-dependent, learning an appropriate distance metric or similarity function from data for each specific task is usually superior to the default Euclidean distance...

متن کامل

L2-constrained Softmax Loss for Discriminative Face Verification

In recent years, the performance of face verification systems has significantly improved using deep convolutional neural networks (DCNNs). A typical pipeline for face verification includes training a deep network for subject classification with softmax loss, using the penultimate layer output as the feature descriptor, and generating a cosine similarity score given a pair of face images. The so...

متن کامل

An Effective Approach for Robust Metric Learning in the Presence of Label Noise

Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...

متن کامل

Crystal Loss and Quality Pooling for Unconstrained Face Verification and Recognition

In recent years, the performance of face verification and recognition systems based on deep convolutional neural networks (DCNNs) has significantly improved. A typical pipeline for face verification includes training a deep network for subject classification with softmax loss, using the penultimate layer output as the feature descriptor, and generating a cosine similarity score given a pair of ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010